Computer programs can be written many different ways and still achieve the same
effect. Until recently, programmers have had little reason to favor one method of
expressing code over another. We have come to learn, however, that functionally
equivalent programs can have extremely important stylistic differences.

Good programming style cuts across application areas, technique and language.
Programs written with good style are easier to read and understand, and often smaller
and more efficient, than those written badly. Yet few programmers have ever been
taught what style is, as we can see from even cursory inspection of their code. Even the
techniques of structured programming do not ensure that code will be good; "structured"
programs can be just as bad as their unstructured counterparts.

This paper is a survey of some aspects of programming style, primarily expression 
and structure, showing by example what happens when principles of style are violated, 
and what can be done to improve programs. To add the ring of truth to our discussion, 
the examples are all taken verbatim from programming textbooks. 

Key Words and Phrases: programming style, structured programming, control-flow
structures.

CR Categories: 2.49, 4.0, 4.6 


I. INTRODUCTION 


Five or ten years ago, if you had asked 
someone what good programming style was, 
you would likely have received (if you didn’t 
get a blank stare) a lecture on 

1) how to save microseconds. 

2) how to save words of memory. 

3) how to draw neat flowcharts. 

4) how many comments to write per line of 
code. 

But our outlook has changed in the last
few years. E.W. Dijkstra [4] argues that
programming is a job for skilled professionals, not
clever puzzle solvers. While attempting to
prove the correctness of programs, he found
that some coding practices were so difficult to
understand that they were best avoided. His
now famous letter, "Go To Statement
Considered Harmful" [5], began a debate, not yet
completed, on how to structure programs
properly. (See [10, 14], for instance.)


Harlan D. Mills [1], using chief programmer
teams and programming with just a few
well understood control structures (which do
not include the GOTO), was able to report
the on-time delivery of a large application
package with essentially no bugs. Clearly, if
such results could be consistently reproduced,
programming would be raised from the state
of black art.

The final word is not yet in on how best
to write code. G.M. Weinberg [13], approaching
the problem as both a psychologist and a
programmer, is studying what people do well
and what they do badly, so we can have a
more objective basis for deciding what
programming tools to use. Programming
languages are still evolving as we learn which
features encourage good programming [...].
We have learned that the way to make
programs more efficient is usually by changing
algorithms, not by writing very tight but
incomprehensible code [8]. And people who
continue to use GOTOs, out of preference or
necessity, are at least thinking more carefully
about how they use them [9].
We feel, however, that programming
style goes beyond even these considerations. 
While writing The Elements of Programming 
Style [7], we reviewed hundreds of published 
programs, in textbooks and in the recent 
literature. It is no secret that imperfect pro- 
gramming practice is common — we found 
plenty of evidence of that. There are even 
bug-infested and unreadable “structured” pro- 
grams. Writing in teams, using proper struc- 
tures, avoiding GOTOs — all are useful in- 
gredients in the manufacture of good code. 
But they are not enough. 

Today, if you asked someone what good 
programming style is, you would (or should!) 
get quite a different lecture, for we now know 
that neat flowcharts and lots of comments 
can’t salvage bad code, and that all those mi- 
croseconds and bytes saved don’t help when 
the program doesn’t work. Today’s lecture on 
“What is good programming style?” would 
probably be more like this... 


Expression: 

At the lowest level of coding, individual 
statements and small groups of statements 
have to be expressed so they read clearly. 
Consider the analogy with English — if you 
can’t write a coherent sentence, how will you 
put together paragraphs, let alone write a
book? So if your individual program state- 
ments are incoherent and unintelligible, what 
will your subroutines and operating systems 
be like? 

Structure: 

The larger structure of the code should 
also read clearly — it should hang together 
the same way a paper or a book in English 
should. It should be written with only a 
handful of control-flow primitives, such as if- 
then-else, loops, statement groups (begin-end 
blocks, subroutines), and it probably 
shouldn’t contain any GOTOs. This is one as- 
pect of what we mean by structured program- 
ming. Coding in this set of well-behaved 
structures makes code readable, and thus 
more understandable, and thus more likely to 
be right (and incidentally easier to change and 
debug). 

The data structure of a program should 
be chosen with the same care as the control 
flow. Choose a data representation that makes 
the job easy to program: the program 
shouldn’t have to be convoluted just to get 
around its data. 

Robustness: 

A program should work. Not just on the 
easy cases, or on the well-exercised ones, but 
all the time. It should be written to defend it- 
self against bad data from the outside world. 
“Garbage in, garbage out” is not a law of na- 
ture; it just means that a programmer shirked 
his responsibility for checking his input. Spe- 
cial cases should work — the program should 
behave at its boundaries. For instance, does 
the sorting program correctly sort a list with 
just one element? Does the table lookup 
routine work when the table is empty? 
Efficiency and Instrumentation:

Only now should the lecture on style get around to "efficiency." Not
how fast a program runs or how much memory it takes [...] changing
[...] alone.

Documentation:

If you write code with care in the detailed expression, using the fundamental [...]
code.

[...] expression and structure, with occasional digressions [...] more fully discussed in [7].
[...] we have chosen a set [...] criticism of [...]
(Of course, we are all human, and it is all too easy to introduce shortcomings into programs.)

[...] Fortran and PL/I; none contain particularly difficult constructions.
[...] the examples without difficulty. The principles illustrated are applicable in all languages.

II. EXPRESSION 

Our favorite example, the one we feel best underlines the need for something that can
only be called good style, is this three-line Fortran program:

      DO 1 I=1,N
      DO 1 J=1,N
    1 X(I,J)=(I/J)*(J/I)

It is an interesting experiment to ask a group of [...] whether
they know what it does. Even after a minute of study, there are still puzzled looks. When the
group is quizzed, one finds that only a few actually got the correct answer.

What does it do? Fortran integer division truncates, so I/J is zero when I is less than J,
and J/I is zero when J is less than I. Only when I equals J does the product
happen to be one. Thus the code puts ones on the diagonal of X and zeros everywhere
else.

Clever? Certainly, but it hardly qualifies as a [...]. It is neither
short nor fast, despite its terse representation in Fortran. [...] faster
because [...]
But [...] uses the matrix. It is more important to make the code
clear, so people can debug, maintain and modify it.

There is a principle of style in English that says, "Say what you mean, as simply and
directly as you can." The same principle applies to programming. We mean that if I equals J,
X(I,J) should be 1; if I is not equal to J, X(I,J) should be zero. So say it:

      DO 20 I = 1, N
         DO 10 J = 1, N
            IF (I .EQ. J) X(I,J) = 1.0
            IF (I .NE. J) X(I,J) = 0.0
   10    CONTINUE
   20 CONTINUE


If this proves to be too "inefficient”, then it may be refined into a faster but somewhat less clear 
version: 


      DO 20 I = 1, N
         DO 10 J = 1, N
   10    X(I,J) = 0.0
   20 X(I,I) = 1.0

It is arguable which of these is better, but both are better than the original. Don’t make debug- 
ging harder than it already is — don’t be too clever. 
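
That the clever and the direct formulations agree can be checked mechanically. The following is a minimal sketch in Python rather than Fortran; the function names `clever_identity` and `clear_identity` are hypothetical, and Python's `//` is assumed to stand in for Fortran integer division on positive subscripts.

```python
def clever_identity(n):
    """Mimic X(I,J) = (I/J)*(J/I) using 1-based subscripts and integer division."""
    return [[float((i // j) * (j // i)) for j in range(1, n + 1)]
            for i in range(1, n + 1)]


def clear_identity(n):
    """Say what is meant: one on the diagonal, zero everywhere else."""
    return [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]


if __name__ == "__main__":
    for row in clear_identity(3):
        print(row)
```

Both functions build the same matrix, but only the second can be understood without working out how truncating division behaves.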


Being too Complicated 

Here is another Fortran example, which is an interesting contrast with the previous one: 


      IF (X .LT. Y) GO TO 30
      IF (Y .LT. Z) GO TO 50
      SMALL = Z
      GO TO 70
   30 IF (X .LT. Z) GO TO 60
      SMALL = Z
      GO TO 70
   50 SMALL = Y
      GO TO 70
   60 SMALL = X
   70 ...


Ten and a half lines of code are used, with four statement numbers and six GOTOs; surely
something must be happening. Before reading further, test yourself. What does this program
do?

The mnemonic SMALL is a giveaway: the sequence sets SMALL to the smallest of X,
Y, and Z. Where the first example was too clever, this one is too wordy and simple-minded.
Since this code was intended to show how to compute the minimum of three numbers, we
should ask why it wasn't written like this:


      SMALL = X
      IF (Y .LT. SMALL) SMALL = Y
      IF (Z .LT. SMALL) SMALL = Z


No labels, no GOTO’s, three statements, and clearly correct. And the generalization to comput- 
ing the minimum of many elements is obvious. 
Of course, if our goal is to get the job done, rather than teaching how to compute a
minimum, we can write, much more readably than the original, the single statement

      SMALL = AMIN0(X, Y, Z)

One line replaces ten. How can a piece of code that is an order of magnitude too large be
considered reliable? There is that much greater chance for confusion, and hence for the
introduction of bugs. There is that much more that must be understood in order to make
evolutionary changes.
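
The generalization of the three-statement minimum to many elements can be sketched as follows. This is an illustration in Python, not code from the paper; the name `smallest` and the decision to raise an error on an empty sequence are assumptions made for the example.

```python
def smallest(values):
    """Return the smallest element of a non-empty sequence."""
    if not values:
        raise ValueError("smallest() needs at least one value")
    small = values[0]
    for v in values[1:]:
        if v < small:        # same test as IF (Y .LT. SMALL) SMALL = Y
            small = v
    return small


if __name__ == "__main__":
    print(smallest([3.0, 1.0, 2.0]))
```

The loop body is the same single comparison repeated, which is why the linear form extends so easily while the branching form does not.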

Clarity versus "Efficiency"

It seems obvious that a program should be clear, yet clarity is often sacrificed needlessly in
the name of efficiency or expediency.

      DO 10 I=1,M
      IF(BP(I)+1.0) 19,11,10
   11 IBN1(I) = BLNK
      IBN2(I) = BLNK
      GO TO 10
   19 BP(I) = -1.0
      IBN1(I) = BLNK
      IBN2(I) = BLNK
   10 CONTINUE

If BP(I) is less than or equal to -1, this excerpt will set BP(I) to -1 and put blanks in IBN1(I)
and IBN2(I). The code uses a hard-to-read Fortran arithmetic IF that branches three ways, two
almost-duplicated pieces of code, two labels and an extra GOTO, all to avoid setting BP(I) to -1
if it is already -1.

There is no need to make a special case. Write the code so it can be read:

      DO 10 I = 1, M
         IF (BP(I) .GT. -1.0) GOTO 10
         BP(I) = -1.0
         IBN1(I) = BLNK
         IBN2(I) = BLNK
   10 CONTINUE

Interestingly enough, our version will be more "efficient" on most machines, both in space and
in time: although we may reset BP(I) unnecessarily, we do less bookkeeping. What did concern
with "efficiency" in the original version produce, besides a bigger, slower, and more obscure
program?

Rewriting

These may seem like small things, taken one at a time. But look what happens when the
need for clear expression is consistently overlooked, as in this PL/I program which computes
the integral of X**2 between zero and one by approximating the area with
rectangles of various widths.
TRAPZ: PROCEDURE OPTIONS (MAIN);
   DECLARE MSSG1 CHARACTER (20);
   MSSG1 = 'AREA UNDER THE CURVE';
   DECLARE MSSG2 CHARACTER (23);
   MSSG2 = 'BY THE TRAPAZOIDAL RULE';
   DECLARE MSSG3 CHARACTER (16);
   MSSG3 = 'FOR DELTA X = 1/';
   DECLARE I FIXED DECIMAL (2);
   DECLARE J FIXED DECIMAL (2);
   DECLARE L FIXED DECIMAL (7,6);
   DECLARE M FIXED DECIMAL (7,6);
   DECLARE N FIXED DECIMAL (2);
   DECLARE AREA1 FIXED DECIMAL (8,6);
   DECLARE AREA FIXED DECIMAL (8,6);
   DECLARE LMTS FIXED DECIMAL (5,4);
   PUT SKIP EDIT (MSSG1) (X(9), A(20));
   PUT SKIP EDIT (MSSG2) (X(7), A(23));
   PUT SKIP EDIT (' ') (A(1));
   AREA = 0;
   DO K = 4 TO 10;
      M = 1 / K;
      N = K - 1;
      LMTS = .5 * M;
      I = 1;
      DO J = 1 TO N;
         L = (I / K) ** 2;
         AREA1 = .5 * M * (2 * L);
         AREA = AREA + AREA1;
         IF I = N THEN CALL OUT;
         ELSE I = I + 1;
      END;
   END;
   OUT: PROCEDURE;
      AREA = AREA + LMTS;
      PUT SKIP EDIT (MSSG3, K, AREA) (X(2), A(16), F(2), X(6),
         F(9,6));
      AREA = 0;
      RETURN;
   END;
END;

Everything about this program is wordy. The output messages are declared and assigned 
unnecessarily. There are far too many temporary variables and their associated declarations. 
The structure sprawls. 

Try going through the code, fixing just one thing at a time — put the error messages in the 
PUT statements where they belong. Eliminate the unnecessary intermediate variables. Com- 
bine the remaining declarations. Simplify the initializations. Delete the unnecessary procedure 
call. You will find that the code shrinks before your very eyes, revealing the simple underlying 
algorithm. 

Here is our revised version:

TRAPZ: PROCEDURE OPTIONS (MAIN);
   DECLARE (J, K) FIXED DECIMAL (2),
           AREA FIXED DECIMAL (8,6);

   PUT SKIP EDIT ('AREA UNDER THE CURVE',
                  'BY THE TRAPEZOIDAL RULE')
                 (X(9), A, SKIP, X(7), A);
   PUT SKIP;

   DO K = 4 TO 10;
      AREA = 0.5/K;
      DO J = 1 TO K-1;
         AREA = AREA + ((J/K)**2)/K;
      END;
      PUT SKIP EDIT ('FOR DELTA X = 1/', K, AREA)
                    (X(2), A, F(2), X(6), F(9,6));
   END;
END TRAPZ;


Both versions give the same results, so this was not an exercise in debugging in the [...]
be in charge of, when changes are necessary.

The original version reads like a hasty first draft which was later patched. Arriving at our
[...] Programmers sometimes say that they haven't time to worry about niceties like style [...]
will soon find that, with practice, you spend less and less time revising, because you do a
better and better job the first time.

Much more can be said about how to make code locally more readable (see [7]), but for
now we will turn to a topic that has recently become popular: how to specify control flow with
good style.
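
The simple algorithm uncovered by the TRAPZ rewrite can be sketched in a few lines. This is a Python illustration under stated assumptions, not a transcription of either PL/I version: the function name `trapezoid_x_squared` is hypothetical, and floating point replaces the FIXED DECIMAL arithmetic of the listings.

```python
def trapezoid_x_squared(k):
    """Trapezoidal estimate of the integral of x**2 on [0, 1] using k strips."""
    area = 0.5 / k                  # end points: (0**2 + 1**2) / 2, times width 1/k
    for j in range(1, k):           # interior points j/k for j = 1 .. k-1
        area += (j / k) ** 2 / k
    return area


if __name__ == "__main__":
    for k in range(4, 11):          # same range as DO K = 4 TO 10
        print(f"FOR DELTA X = 1/{k:2d}      {trapezoid_x_squared(k):9.6f}")
```

As in the revised listing, the end-point contribution is folded into the initialization, so the loop contains nothing but the sum over interior points.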


III. CONTROL FLOW STRUCTURE

[...] this is a narrow view, we will keep to just that aspect for the time being.

It has been shown [2] that programs can be written using just: 

1) Alternation, such as IF-THEN-ELSE, where the ELSE part may be optional.

2) Looping, such as WHILE or the Fortran DO loop. Different flavors have the termination
test at the beginning or end of the loop.

3) Grouping, such as subroutines and compound statements.

While these tools are sufficient, in the same sense that a Turing machine can perform any of a
wide class of calculations, it is convenient to add:

4) CASE switches, which are essentially multi-way IF statements, and 

5) BREAK and ITERATE statements, which exit from a loop or skip to the test portion of a 
loop, respectively. 

Most languages have at best a subset of these forms, so the pragmatic programmer cannot
hope to avoid the more primitive control statements carried over from earlier days. For
example, the simplest way to implement a BREAK in PL/I is to use a GOTO. And in Fortran, of
course, GOTOs and statement numbers must be sprinkled liberally throughout the best designed
code. But the basic design of a program should be done in terms of the fundamental structures.
GOTOs and other primitive language features should be used only to implement the basic
structures outlined above.

While these are well tried and useful forms, there is a tendency to believe that just by
using them (and only them) one can avoid all trouble. This is false: they are not panaceas.
Good style, care and intelligence are still needed. We can see this just by studying the use and
abuse of the IF-THEN-ELSE, certainly a simple and fundamental structure in any programming
language.

Null THEN 

The following routine is supposed to sort an array of eight numbers into increasing order
of absolute value:

      DCL A(8);
      GET LIST (A);
      DO I = 1 TO 8;
         IF ABS(A(I)) < ABS(A(I+1)) THEN;
         ELSE BEGIN;
            STORE = A(I);
            A(I) = A(I+1);
            A(I+1) = STORE;
         END;
      END;
      PUT LIST (A);

The heart of this sequence is a “DON’T” statement — if the specified condition is true, do 
nothing, otherwise do something. Anything so misleading should put us on guard; and indeed 
we see immediately that the sequence cannot possibly sort correctly because 

1) only one pass is made over the array, and we know simple sorting takes about N passes. 

2) a reference is made outside array bounds when A(I+1) is accessed on the last iteration 
with I equal to 8. 

There are several ways of doing a simple sort correctly. We could make N-1 passes over
the array, or we could set a flag every time it is necessary to exchange two elements, so we 
know that an additional pass over the array is needed. Applying this latter fix to the program 
above (and eliminating the subscript range error) should give us a working sort. 

But there is still a lurking bug. Turning the test around so the IF-THEN is stated more 
naturally: 
      IF ABS(A(I)) >= ABS(A(I+1)) THEN DO;
         STORE = A(I);
         A(I) = A(I+1);
         A(I+1) = STORE;
         EXCH = '1'B;
      END;

reveals that two elements will be exchanged even if they are equal. If A contains two equal
elements, the program goes into an infinite loop exchanging them, because the exchange flag is
set repeatedly. Using a null THEN may seem a small thing, until it adds a day of debugging
time.

Even when code is correct, it can be very hard to read. Here's another sorting program,
which sorts into descending order this time, with an almost-null THEN.


      DO M = 1 TO N;
         K = N-1;
         DO J = 1 TO K;
            IF ARAY(J) - ARAY(J+1) >= 0
               THEN GO TO RETRN;
               ELSE;
            SAVE = ARAY(J);
            ARAY(J) = ARAY(J+1);
            ARAY(J+1) = SAVE;
      RETRN: END;
      END;

A THEN GOTO might be a BREAK statement in disguise, but often it is a
[...]
into PL/I. Revision is easy:

      DO M = 1 TO N-1;
         DO J = 1 TO N-1;
            IF ARAY(J) < ARAY(J+1) THEN DO;
               SAVE = ARAY(J);
               ARAY(J) = ARAY(J+1);
               ARAY(J+1) = SAVE;
            END;
         END;
      END;

The original program worked, but again we were able to improve it with little effort. 
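
The flag-based repair described earlier for the absolute-value sort can be sketched as follows. This is a hedged illustration in Python; the function name `sort_by_abs` and the variable `exchanged` are hypothetical and do not appear in the textbook programs.

```python
def sort_by_abs(a):
    """Return a copy of a sorted into increasing order of absolute value."""
    a = list(a)
    exchanged = True
    while exchanged:                          # another pass is needed after any exchange
        exchanged = False
        for i in range(len(a) - 1):           # stops one short, so a[i + 1] is in range
            if abs(a[i]) > abs(a[i + 1]):     # strict test: equal elements stay put
                a[i], a[i + 1] = a[i + 1], a[i]
                exchanged = True
    return a


if __name__ == "__main__":
    print(sort_by_abs([3, -1, 2, -1]))
```

Stating the test positively and strictly avoids both faults discussed above: the subscript never runs past the end, and equal elements cannot keep the flag set forever.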

In Fortran there are fewer options when using IFs, for there is no ELSE clause and no
way to form compound groups of statements. But in the few cases where the language lets you
write clearly, do so. Don't write like this:

      IF (A(I).GT.GRVAL) GO TO 30
      GO TO 25
   30 GRVAL = A(I)
   25 ...

A branch around the branch that branches around what we wanted to do in the first place! Say
what you mean, as simply and directly as you can:

      IF (A(I) .GT. GRVAL) GRVAL = A(I)

There are now no labels, no GOTOs, and the code can be understood even when read aloud
over a telephone. (This is always a good test to apply to your code: if you can't understand it
when spoken aloud, how easy will it be to grasp when you read it quietly to yourself?)

ELSE BREAK 

The BREAK statement has its uses, but it has to be used judiciously. Consider this se- 
quence for finding the largest of a set of positive numbers: 

      DCL NEWIN DEC FLOAT (4),
          LARGE DEC FLOAT (4) INIT (.0E1);
          /* .0 x 10**1 = .0 x 10 = 0.0 */
NEXT_C: GET LIST (NEWIN);
      IF NEWIN >= 0
         THEN IF NEWIN > LARGE
                 THEN LARGE = NEWIN;
                 ELSE GO TO NEXT_C;
         ELSE GO TO FINISH;
      GO TO NEXT_C;
FINISH: PUT LIST (LARGE);

Ignoring the curious zero in the INIT attribute, and the equally curious explanatory comment, 
we can see that this program does indeed use just the structures we mentioned above (the 
GOTOs implement BREAKS and ITERATES). Therefore it should be readable. But tracing the 
tortuous flow of control is not a trivial exercise — how does one get to that last GOTO 
NEXT_C? Why, from the innermost THEN clause, of course. 

The ELSE BREAK is just as confusing as the DON’T statement. It tells you where you 
went if you didn’t do the THEN, leaving you momentarily at a loss in finding the successor to 
the THEN clause. And when ELSE BREAKS are used one after the other, as here, the mind 
boggles. 

Such convolutions are almost never necessary, since an organized statement of the prob- 
lem leads to a simple series of decisions: 


      DECLARE (NEWIN, LARGE) DECIMAL FLOAT (4);
      LARGE = 0;
NEXT_C: GET LIST (NEWIN);
      IF NEWIN > LARGE THEN LARGE = NEWIN;
      IF NEWIN >= 0 THEN GOTO NEXT_C;
      PUT LIST (LARGE);

What we have here is a simple DO-WHILE, done while the number read is not negative, 
controlling a simple IF-THEN. Of course we have rearranged the order of testing, but the end- 
of-data marker chosen was a convenient one and does not interfere with the principal work of 
the routine. True, our version makes one extra test, comparing the marker against LARGE, but 
that will hardly affect the overall efficiency of the sequence. Readability is certainly improved 
by avoiding the ELSE GOTOs. 
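
In a language with a WHILE, the loop just described might look like the sketch below. It is written in Python as an illustration only; the name `largest_before_sentinel` is hypothetical, and treating the end of the input as an implicit negative marker is an assumption added so that the example always terminates.

```python
def largest_before_sentinel(numbers):
    """Return the largest value read before the first negative number."""
    source = iter(numbers)
    large = 0.0
    newin = next(source, -1.0)      # running out of input acts as the end marker
    while newin >= 0:               # done while the number read is not negative
        if newin > large:
            large = newin
        newin = next(source, -1.0)
    return large


if __name__ == "__main__":
    print(largest_before_sentinel([3.0, 9.0, 4.0, -1.0, 100.0]))
```

Here the marker is tested before the comparison, so the extra test mentioned above disappears, at the price of reading the input in two places.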
THEN-IF

Now consider:

      IF QTY > 10 THEN                                     /*A*/
         IF QTY > 200 THEN                                 /*B*/
            IF QTY >= 500 THEN BILL_A = BILL_A + 1.00;     /*C*/
            ELSE BILL_A = BILL_A + .50;                    /*C*/
         ELSE;                                             /*B*/
      ELSE BILL_A = .00;                                   /*A*/

Those letters down the right hand side are designed to help you figure out what is going on, but
as usual no amount of commenting can rescue bad code. The code requires you to maintain a
mental pushdown stack of what tests were made, so that at the appropriate point you can pop
them until you determine the corresponding action (if you can still remember). You might time
yourself as you determine what this code does when QTY equals 350. How about 150?

Since only one of a set of actions is ever called for here, a frequent occurrence, what we
really want is some form of CASE statement. In PL/I, the most general CASE is implemented
by a series of ELSE-IFs:

      IF cond1 THEN first case;
      ELSE IF cond2 THEN second case;
      ELSE IF condn THEN nth case;
      ELSE default;

If there is no default action, the last ELSE clause is omitted. We can rewrite the example as:

      IF QTY >= 500 THEN BILL_A = BILL_A + 1.00;
      ELSE IF QTY > 200 THEN BILL_A = BILL_A + 0.50;
      ELSE IF QTY <= 10 THEN BILL_A = 0.0;

Now all we need do is read down the list of tests until we find one that is met, read across to 
the corresponding action, and continue after the last ELSE. In Fortran, this can be rendered 
similarly as 

      IF (QTY .GE. 500.0) BILLA = BILLA + 1.0
      IF (QTY .LT. 500.0 .AND. QTY .GT. 200.0) BILLA = BILLA + 0.5
      IF (QTY .LE. 10.0) BILLA = 0.0

which is best if the relations and actions are simple enough to write one per line and the tests 
are mutually exclusive. Don't let anyone tell you this is not efficient - it doesn't take all that
much time to make the whole set of tests, and you’re more likely to get the code right the first 
time. If it does take too much time, and you have measurements that prove it, then and only 
then should you re-write it with GOTOs. 
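
The same chain of tests reads naturally in any language with an else-if form. The sketch below uses Python; the function name `adjust_bill` and the choice to return the new value rather than assign to BILL_A are assumptions made for the example.

```python
def adjust_bill(qty, bill):
    """Apply the quantity rules from the BILL_A example and return the new bill."""
    if qty >= 500:
        return bill + 1.00
    elif qty > 200:
        return bill + 0.50
    elif qty <= 10:
        return 0.0
    return bill                     # 10 < qty <= 200: no action is called for


if __name__ == "__main__":
    for qty in (350, 150, 500, 10):
        print(qty, adjust_bill(qty, 2.0))
```

Reading down the tests answers the earlier question at once: a quantity of 350 adds 0.50, and a quantity of 150 leaves the bill unchanged.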

The THEN-IF was the culprit in this example, but we could have given the disease another
name. Note the null ELSE clause, required to make the unstacking come out right when one
of the conditions has no corresponding action. These seemingly useless statements cauterize the
stumps of any ill-thought-out THEN-IFs buried in the code. A program containing null ELSE
clauses is suspect, if for no other reason than that it was written by someone bitten by
THEN-IFs often enough to sprinkle null ELSEs around for insurance.

The THEN-IF does have its uses. It is often the only way to ensure that tests with side
effects are performed in the proper order, as in

      IF I > 0 THEN
         IF A(I) = B(I) THEN ...

which ensures that I is in range before its use as an index. Some languages provide special
Boolean connectives [11] which guarantee left-to-right evaluation and early exit as soon as the
truth value of the expression is determined; but if you are not fortunate enough to be able to
program with these useful tools, use THEN-IFs and don't forget to cauterize.
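
Python's `and` is one such connective: it evaluates left to right and stops as soon as the result is known. The sketch below is illustrative only; the function name `same_at` is hypothetical, and the upper-bound check is an addition beyond the `I > 0` test shown above.

```python
def same_at(a, b, i):
    """True when i is a valid 1-based subscript for both lists and a(i) equals b(i)."""
    # The range test comes first; if it fails, the subscripting is never attempted.
    return 0 < i <= min(len(a), len(b)) and a[i - 1] == b[i - 1]


if __name__ == "__main__":
    print(same_at([1, 2], [1, 3], 1))
    print(same_at([1, 2], [1, 3], 0))
```

Because the guard and the guarded test sit in one expression, no nested IF and no null ELSE are needed.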

Bushy Trees 

Most of the IF-THEN-ELSE examples we have shown so far have a characteristic in com- 
mon, besides the unreadable practices we pointed out. Each approximates, as closely as the pro- 
grammer could manage, a minimum depth decision tree for the problem at hand. If all out- 
comes have equal probability, such a tree arrives at the appropriate action with the minimum 
number of tests on the average, so we are all encouraged to lay out programs accordingly. But a 
program is a one-dimensional construct, which obscures any two-dimensional connectedness it 
may have. Perhaps the minimum depth tree is not the best structure for a reliable program. 

Let us rewrite the minimum function in PL/I, adhering to the spirit of the original Fortran, 
but using only IF-THEN-ELSEs: 

      IF X >= Y THEN
         IF Y >= Z THEN SMALL = Z;
         ELSE SMALL = Y;
      ELSE
         IF X >= Z THEN SMALL = Z;
         ELSE SMALL = X;

Even though neatly laid out and properly indented, it is still not easy to grasp. Not all the
confusion of the original can be attributed to the welter of GOTOs and statement numbers. What
we have here is a "bushy" tree, needlessly complex in any event, but still hard to read simply
because it is conceptually short and fat.

The ELSE-IF sequence, on the other hand, is long and skinny as trees go; it seems to
more closely reflect how we think. (Note that our revised minimum function was also linear.)
It is easier to read down a list of items, considering them one at a time, than to remember the
complete path to some interior part of a tree, even if the path has only two or three links.
Seldom is it actually necessary to repeat tests in the process of stringing out a tree into a list; often
it is just a matter of performing the tests in a judicious order. Yet too often programmers tend
to build a thicket of logic where a series of signposts is called for.

Summary of IF-THEN-ELSE 

Let us summarize our discussion of IF-THEN-ELSE. The most important principle is to 
avoid bushy decision trees like: 

      IF ...
         THEN IF ...
            ELSE ...
         ELSE IF ...
            THEN ...
            ELSE ...

The bushy tree should almost always be reorganized into a CASE statement, which is
implemented as a string of ELSE-IFs in PL/I. The resulting long thin tree is much easier to
understand:

      IF ... THEN ...
      ELSE IF ... THEN ...

A THEN-IF is an early warning that a decision tree is growing the wrong way. A null ELSE
indicates that the programmer knows that trouble lies ahead and is trying to defend against it.
And an ELSE BREAK from such a structure may leave the reader at a loss to understand how
the following statement is reached.

A null THEN or (more commonly) THEN GOTO usually indicates that a relational test
needs to be turned around, and some set of statements made into a block.

The general rule is: after you make a decision, do something. Don't just go somewhere or
make another decision. If you follow each decision by the action that goes with it, you can see
at a glance what each decision implies.

WHILE 

Looping is fundamental in programming. Yet explicit loop control in Fortran or PL/I can
only be specified by a DO statement, which encourages the belief that all loops involve repeated
incrementing of an integer variable until it exceeds some predetermined value. Fortran further
insists that the loop body be obeyed once before testing to see whether the loop should have
been entered at all.

Thinking in terms of DO statements, instead of loops, leads to programs like this sine 
routine: 

      DOUBLE PRECISION FUNCTION SIN(X,E)
C     THIS DECLARATION COMPUTES SIN(X) TO ACCURACY E
      DOUBLE PRECISION E,TERM,SUM
      REAL X
      TERM=X
      DO 20 I=3,100,2
      TERM=TERM*X**2/(I*(I-1))
      IF(TERM.LT.E) GO TO 30
      SUM=SUM+(-1**(I/2))*TERM
   20 CONTINUE
   30 SIN=SUM
      RETURN
      END

The program consists entirely of a loop, which computes and sums the terms of a Maclaurin
series until the terms get too small or a predetermined number have been included in the sum.

In its most general form, a loop should be laid out as:

      initialize
      while (reason for looping)
         body of loop

This way, the parts are clearly specified and kept separate. But this approach was evidently not 
taken here: 
1) The program fails to initialize SUM along with TERM and I.

2) The program mis-states the convergence test, returning immediately on negative values of
X.

3) The convergence test is misplaced, so the last TERM computed is not included in SUM.
And TERM is computed unnecessarily when the convergence test is met right from the
start.

These three bugs can be traced directly to poor structural design. There is also a fourth
bug:

4) TERM is computed incorrectly because the "**" operator binds tighter than unary minus
(another case of being too clever?).

We first write the code in an anonymous language that includes the WHILE:

      sin = x
      term = x
      i = 3
      while (i < 100 & abs(term) > e)
         term = -term * x**2 / (i * (i - 1))
         sin = sin + term
         i = i + 2
      return

and then translate into Fortran:

      SIN = X
      TERM = X
      DO 20 I = 3, 100, 2
         IF (DABS(TERM) .LT. E) GOTO 30
         TERM = -TERM * X**2 / FLOAT(I * (I - 1))
         SIN = SIN + TERM
   20 CONTINUE
   30 RETURN

In this case, the WHILE becomes a DO followed by an IF. The DO neatly summarizes 
the initialization, incrementing, and testing of I, and keeps the loop control separate from the 
computation. It is a useful statement. The important thing is to recognize its shortcomings and 
plan loops in terms of the more general WHILE. 
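
The WHILE formulation can be run almost as written in a language that has the construct. The following Python sketch is an illustration, not the authors' code; the function name `sine` and the parameter `max_i` (standing in for the limit of 100) are assumptions.

```python
import math


def sine(x, eps, max_i=100):
    """Sum the Maclaurin series for sin(x) until a term is no larger than eps."""
    total = x                       # initialize: the first term of the series is x
    term = x
    i = 3
    while i < max_i and abs(term) > eps:        # the reason for looping, stated once
        term = -term * x * x / (i * (i - 1))    # next term, with alternating sign
        total += term
        i += 2
    return total


if __name__ == "__main__":
    print(sine(0.5, 1e-12), math.sin(0.5))
```

The initialization, the test, and the body are each stated once and in that order, which leaves no room for the four faults listed above.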

In PL/I, the DO-WHILE and DO I=J TO K constructions make the test at the top of the 
loop, which is most often what is wanted. Fortran programs, on the other hand, frequently fail
to “do nothing gracefully” because DO loops insist on being performed at least once, regardless 
of their limits, even when action is undesirable. For example, this function finds the smallest 
element in an array. 


      FUNCTION SMALL(A,N)
      DIMENSION A(1)
      SMALL = A(1)
      DO 1 K = 2,N
      IF (A(K) - SMALL) 2,1,1
    2 SMALL = A(K)
    1 CONTINUE
      RETURN
      END

Clearly it's more efficient to use the DO limits of "2,N" - it saves a useless comparison - but
what if N is one? Don't kid yourself: N will be equal to one some day, and the program will
surely fail when it looks at the undefined A(2). Had we first written this routine with a WHILE
statement, we would have seen the need for an IF to protect the DO in the translated version.
Or, we could have written directly:

      SMALL = A(1)
      DO 1 K = 1, N
         IF (A(K) .LT. SMALL) SMALL = A(K)
    1 CONTINUE

This may be less "efficient" in the small, but the cost of finding the bug in the original, and
repairing the damage it caused, will certainly outweigh the few microseconds more that our version
takes. (You have to weigh for yourself the question of whether to test if N is less than one.)

IV. DATA STRUCTURE 

Putting the hard parts of a program into an appropriate data structure is an art, but well
worthwhile. (Imagine doing long division in Roman numerals.) This program converts the year
and day of the year into the month and day of the month:


DATES: PROC OPTIONS (MAIN);
   [...] IYEAR < 0 THEN RETURN;
   IF IDATE <= 31 THEN GO TO JAN;
   I = IYEAR/400; IF I = IYEAR/400 THEN GO TO LEAP;
   I = IYEAR/100; IF I = IYEAR/100 THEN GO TO NOLEAP;
   I = IYEAR/4; IF I = IYEAR/4 THEN GO TO LEAP;
NOLEAP: L = 0;
   IF IDATE > 365 THEN RETURN;
LEAP: IF IDATE > 181 + L THEN GO TO G181;
   IF IDATE > 90 + L THEN GO TO G90;
   IF IDATE > 59 + L THEN GO TO G59;
   MONTH = 2; IDAY = IDATE - 31; GO TO OUT;
G59: MONTH = 3; IDAY = IDATE - (59 + L); GO TO OUT;
G90: IF IDATE > 120 + L THEN GO TO G120;
   MONTH = 4; IDAY = IDATE - (90 + L); GO TO OUT;
G120: IF IDATE > 151 + L THEN GO TO G151;
   MONTH = 5; IDAY = IDATE - (120 + L); GO TO OUT;
G151: MONTH = 6; IDAY = IDATE - (151 + L); GO TO OUT;
G181: IF IDATE > 273 + L THEN GO TO G273;
   IF IDATE > 243 + L THEN GO TO G243;
   IF IDATE > 212 + L THEN GO TO G212;
   MONTH = 7; IDAY = IDATE - (181 + L); GO TO OUT;
G212: MONTH = 8; IDAY = IDATE - (212 + L); GO TO OUT;
G243: MONTH = 9; IDAY = IDATE - (243 + L); GO TO OUT;
G273: IF IDATE > 334 + L THEN GO TO G334;
   IF IDATE > 304 + L THEN GO TO G304;
   MONTH = 10; IDAY = IDATE - (273 + L); GO TO OUT;
G304: MONTH = 11; IDAY = IDATE - (304 + L); GO TO OUT;
G334: MONTH = 12; IDAY = IDATE - (334 + L);
OUT: PUT DATA (MONTH, IDAY, IYEAR) SKIP;
   GO TO READ;
JAN: MONTH = 1; IDAY = IDATE; GO TO OUT;
END DATES;

What we have here is a bushy tree to end all bushy trees. The rococo structure of the calendar 
is intimately intertwined with the control flow in an attempt to arrive at the proper answer with 
a minimum number of tests. 

Clarity is certainly not worth sacrificing just to save three tests per access (on the 
average); the irregularities must be brought under control. Most good programmers are 
accustomed to using subprocedures to achieve regularity. The procedure body shows what is 
common to each invocation, and the differences are neatly summarized in the parameter list for 
each call. Fewer programmers learn to use judiciously designed data layouts to capture the 
irregularities in a computation. But we can see that structured programming can also apply to 
the data declarations: 

DATES: PROCEDURE OPTIONS (MAIN);
   DECLARE MONSIZE(0:1, 1:12) INITIAL(
      31,28,31,30,31,30,31,31,30,31,30,31,    /* NON-LEAP */
      31,29,31,30,31,30,31,31,30,31,30,31);   /* LEAP */
READ:
   GET LIST (IYEAR, IDATE) COPY;
   IF MOD(IYEAR,400)=0 |
      (MOD(IYEAR,100)¬=0 & MOD(IYEAR,4)=0)
      THEN LEAP = 1;
      ELSE LEAP = 0;
   IF IYEAR<1753 | IYEAR>3999 | IDATE<=0 | IDATE>365+LEAP THEN
      PUT SKIP LIST('BAD YEAR, DATE', IYEAR, IDATE);
   ELSE DO;
      NDAYS = 0;
      DO MONTH = 1 TO 12
         WHILE (IDATE > NDAYS + MONSIZE(LEAP,MONTH));
         NDAYS = NDAYS + MONSIZE(LEAP,MONTH);
      END;
      PUT SKIP LIST(MONTH, IDATE-NDAYS, IYEAR);
   END;
   GOTO READ;
END DATES;

Most people can recognize a table giving the lengths of the different months (“Thirty days hath 
September...”), so this version can be quickly checked for accuracy. The program may take a bit 
more time counting the number of days every time it is called, but it is more likely to get the 
right answer than you are, and even if the program is used a lot, I/O conversions are sure to use 
more time than the actual computation of the date. The double computation of 
MONSIZE(LEAP, MONTH) falls into the same category — write it clearly so it works; then 
measure to see if it’s worth your while to rewrite parts of it. 
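
For readers without a PL/I compiler at hand, the table-driven idea can be sketched in Python. This is our rendering, not the program above: the names MONTH_SIZE, is_leap and month_and_day are hypothetical, and raising an exception replaces the printed "BAD YEAR, DATE" message. The year limits and the leap-year rule are copied from the revised program.

```python
# Month lengths, indexed first by leap flag (0 or 1), then by month - 1.
MONTH_SIZE = (
    (31, 28, 31, 30, 31, 30, 31, 31, 30, 31, 30, 31),  # non-leap
    (31, 29, 31, 30, 31, 30, 31, 31, 30, 31, 30, 31),  # leap
)


def is_leap(year):
    """Gregorian leap-year rule, as in the revised program."""
    return year % 400 == 0 or (year % 100 != 0 and year % 4 == 0)


def month_and_day(year, day_of_year):
    """Convert (year, day of year) to (month, day of month)."""
    leap = 1 if is_leap(year) else 0
    if (year < 1753 or year > 3999
            or day_of_year <= 0 or day_of_year > 365 + leap):
        # Assumption: signal bad input with an exception instead of printing.
        raise ValueError(f"bad year, date: {year}, {day_of_year}")
    days_before = 0
    month = 1
    # Walk the table until the date falls inside the current month.
    # The validation above guarantees the loop stops by month 12.
    while day_of_year > days_before + MONTH_SIZE[leap][month - 1]:
        days_before += MONTH_SIZE[leap][month - 1]
        month += 1
    return month, day_of_year - days_before
```

All of the calendar's irregularity lives in the table; the control flow is one test and one loop, and a wrong month length would be a one-number correction.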

Our revised date computation shows an aspect of modularity which is often overlooked. 
Most people equate modules with procedures, but our program has several distinct modules and 
only one procedure. A date is input, LEAP is computed, the date is validated, the conversion is 
made and the result is printed. Each of these pieces could be picked up as a unit and planted as 
needed in some other environment with a good chance of working unaltered, because there are 
no unnecessary labels or other cross references between pieces. (The label and GOTO imple- 
ment a WHILE, done while there is still input.) The control flow structures we have described 
tend to split programs into computational units like these and thus lead to internal modularity. 

V. CONCLUSION 

Three topics we have hardly touched, which are usually associated with any discussion of 
style, are efficiency, documentation, and language design. We think these are straw men, almost 
always raised improperly in a consideration of only parochial issues. 

Opponents of programming reform argue that anything that is readable must automatically 
be inefficient. This is the same attitude that says that assembly languages are preferable to 
high-level languages. But as we have seen, good programming is not synonymous with GOTO-less 
programming, and it certainly does not have to be wasteful of time or space. Quite the contrary, 
we find that nearly all our revised programs take no more time and are about the same size as 
the originals. And in some cases the revised version is shorter and faster because unnecessary 
special cases have been eliminated. 

We use few comments in our revisions; most of the programs are short enough to speak 
for themselves. And when a program cannot speak for itself, it is seldom the case that greater 
reliability or understanding will result by interposing yet another insulating layer of 
documentation between the code and the reader. Bad programming practice cannot be explained 
away; it must be rewritten. 

Finally, many people try to excuse badly written programs by blaming inadequacies of the 
language that must be used. We have seen repeatedly that even Fortran can be tamed with 
proper discipline. The presence of bad features is not an invitation to use them, nor is the 
absence of good features an excuse to avoid simulating them as cleanly as possible. Good 
languages are nice, but not vital. 

Our survey of programming style has been sketchy, for there are far too many details that 
must be covered to give a proper treatment here. But there is ample evidence for the existence 
of some discipline beyond a simple set of restrictions on what types of statements to use. It is 
called style. 